REAL PARTY IN INTEREST

This patent application is currently owned by Koninklijke Philips Electronics N.V., as 
indicated by an assignment recorded on April 24, 2001 in the Assignment Records of the U.S. Patent 
and Trademark Office at Reel 011773, Frame 0026.

RELATED APPEALS AND INTERFERENCES 

There are no known appeals or interferences that will directly affect, be directly affected by, 
or have a bearing on the Board's decision in this pending appeal. 

STATUS OF CLAIMS 

Claims 1-20 have been rejected pursuant to an Office Action dated December 29, 2003. 
Claims 1, 2, 4-6, 8-10, 12-14, 17 and 18 were rejected under 35 U.S.C. § 102(b) as being anticipated
by de Haan, G., et al., "True-Motion Estimation with 3-D Recursive Search Block Matching," 
IEEE Transactions on Circuits and Systems for Video Processing, vol. 3, no. 5, pp. 368-79 
(October 1993) ("de Haan"). Claims 3, 7, 11, 15, 16, 19 and 20 were rejected under 35 U.S.C.
§ 103(a) as being unpatentable over de Haan.

Claims 1-20 are presented for appeal. Claims 1-20 are shown in Appendix A. 

STATUS OF AMENDMENTS 

The Appellants submitted an Amendment Under 37 C.F.R. § 1.116 on February 12, 2004.
The Amendment amended Claims 1-4. The Examiner entered the proposed amendments but found
the application not to be in condition for allowance because all the limitations had been previously
addressed. 

SUMMARY OF INVENTION 

According to one embodiment, a video signal processor evaluates candidate vectors of 
enhancement algorithms utilizing an error function biased towards spatio-temporal consistency with 
a penalty function. (Application, Page 7, lines 3-9). Exemplary categories of video enhancement are
restoration of "lost" (image/video) information, elimination of artifacts, and enhancement of selected
image/video characteristics. (Application, Page 14, lines 7-11). Specific examples of video
enhancement are resolution enhancement and edge enhancement. (Application, Page 14, lines 11-15;
Page 5, lines 14-17).

In one embodiment of the invention, a video signal processor 201 includes an enhancement 
vector estimator 202 and enhancement processor 203 which perform the video enhancement 
processing. (Application, Figure 2; Page 13, line 21, through Page 14, line 1). The enhancement
vector estimator 202 includes one or more caches 205a-205n for temporary storage of pixel 
information, one or more block enhancement units 206a-206n, an enhancement vector memory 207, 
and a best enhancement selection unit 208 which identifies and selects the best enhancement on a per 
block basis. (Application, Figure 2; Page 15, lines 4-11).

In one example of the operation of the invention, a set of enhancement algorithms, W_t(), is
provided to enhance a block of pixels within a received field. (Application, Page 18, lines 5-10;
Page 16, lines 3-10). Enhanced video information is created by summing the weighted results of
each of the enhancement algorithms as applied to received video information. (Application, Page
18, lines 3-4). The coefficients used as weights for the algorithms are collected in an enhancement 
vector V. (Application, Page 18, lines 1-2). Several candidate enhancement vectors are compared 
by calculating an error function e for each candidate vector and the candidate vector yielding the 
smallest error value is selected as the enhancement vector for the block under consideration. 
(Application, Page 20, lines 14-17). 
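
The selection scheme summarized above can be illustrated with a minimal sketch. It is not taken from the Application: the function names, the flattened block layout, the two toy algorithms, and in particular the form of the error function (a mismatch against a neighbouring block plus a fixed penalty for departing from a previously selected vector) are all assumptions made for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Block = List[float]                    # flattened pixel block (assumed layout)
Algorithm = Callable[[Block], Block]   # one enhancement algorithm
ErrorFn = Callable[[Block, Sequence[float]], float]


def enhance(block: Block, algorithms: Sequence[Algorithm],
            vector: Sequence[float]) -> Block:
    """Sum the outputs of the algorithms, weighted by the vector coefficients."""
    outputs = [algorithm(block) for algorithm in algorithms]
    return [
        sum(weight * output[i] for weight, output in zip(vector, outputs))
        for i in range(len(block))
    ]


def select_best(block: Block, algorithms: Sequence[Algorithm],
                candidates: Sequence[Sequence[float]],
                error: ErrorFn) -> Tuple[Sequence[float], Block]:
    """Return the candidate vector (and its enhanced block) with the smallest error."""
    best_vector, best_block, best_error = None, None, float("inf")
    for vector in candidates:
        enhanced = enhance(block, algorithms, vector)
        value = error(enhanced, vector)
        if value < best_error:
            best_vector, best_block, best_error = vector, enhanced, value
    return best_vector, best_block


# Hypothetical toy algorithms, used only to exercise the sketch.
def identity(block: Block) -> Block:
    return list(block)


def double(block: Block) -> Block:
    return [2.0 * pixel for pixel in block]


def toy_error(neighbour: Block, previous_vector: Sequence[float],
              penalty: float = 0.5) -> ErrorFn:
    """Assumed error: mismatch with a neighbouring block, plus a penalty
    when the candidate differs from a previously selected vector."""
    def error(enhanced: Block, vector: Sequence[float]) -> float:
        mismatch = sum(abs(a - b) for a, b in zip(enhanced, neighbour))
        bias = 0.0 if tuple(vector) == tuple(previous_vector) else penalty
        return mismatch + bias
    return error


candidates = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
error = toy_error(neighbour=[2.0, 4.0, 6.0], previous_vector=(0.0, 1.0))
best_vector, best_block = select_best([1.0, 2.0, 3.0], [identity, double],
                                      candidates, error)
```

In this sketch each candidate vector holds algorithm coefficients, not a displacement, which is the distinction the argument below relies on.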

The video signal processor executes process 400 in order to enhance received video 
information, according to one embodiment of the invention. (Application, Figure 4; Page 21, lines 
4-10). The signal processor enhances each block of pixels within the received video information 
utilizing each of a plurality of candidate enhancement vectors. (Application, Figure 4, step 403; 
Page 21, lines 14-21). The signal processor then computes an error function value for the block as 
enhanced by each candidate enhancement vector. (Application, Figure 4, step 404; Page 21, line 22, 
through Page 22, line 2). The candidate vector corresponding to the lowest error function value is 
selected to enhance that block in the displayed field. (Application, Figure 4, step 405; Page 22, lines 
2-4). 



STATEMENT OF ISSUES 

(1) Whether Claims 1, 2, 4-6, 8-10, 12-14, 17 and 18 are unpatentable under 35 U.S.C. § 102(b).

(2) Whether Claims 3, 7, 11, 15, 16, 19 and 20 are unpatentable under 35 U.S.C. § 103(a).

GROUPING OF CLAIMS

Pursuant to 37 C.F.R. § 1.192(c)(7), the Appellants request that all claims be grouped 
together for purposes of this appeal. 

ARGUMENT 
ISSUE 1 - REJECTION UNDER 35 U.S.C. § 102

The rejection of Claims 1, 2, 4-6, 8-10, 12-14, 17 and 18 under 35 U.S.C. § 102(b) is 
improper and should be withdrawn. 

A. OVERVIEW 

Claims 1, 2, 4-6, 8-10, 12-14, 17 and 18 were rejected under 35 U.S.C. § 102(b) as being 
anticipated by de Haan, G., et al., "True-Motion Estimation with 3-D Recursive Search Block
Matching," IEEE Transactions on Circuits and Systems for Video Processing, vol. 3, no. 5, 
pp. 368-79 (October 1993). ("de Haan"). 

A copy of the claims is provided in Appendix A. A copy of de Haan is provided in 
Appendix B. 

B. STANDARD 

A prior art reference anticipates the claimed invention under 35 U.S.C. § 102 only if every 
element of a claimed invention is identically shown in that single reference, arranged as they are 
in the claims. (MPEP § 2131; In re Bond, 910 F.2d 831, 832, 15 U.S.P.Q.2d 1566, 1567
(Fed. Cir. 1990)). Anticipation is only shown where each and every limitation of the claimed 
invention is found in a single prior art reference. (MPEP § 2131; In re Donohue, 766 F.2d 531, 534, 
226 U.S.P.Q. 619, 621 (Fed. Cir. 1985)). 

C. THE DE HAAN REFERENCE

The context of the de Haan reference is described in the Appellants' application, in the 
Background of the Invention. In a situation where a television display has a higher screen refresh 
rate (field rate) than a received program signal, a field rate conversion must be performed. 
(Application, Page 2, lines 15-22). Simply repeating received fields to achieve this field rate 
conversion results in moving objects appearing slightly displaced from their expected positions in the 
repeated fields. (Application, Page 3, lines 6-11). Figure 5 of the Application illustrates this effect 
for an object moving linearly across the screen within a sequence of three fields: n-2, n-1 and n.
(Application, Page 3, lines 12-14). Field rate conversion by repetition produces intermediate fields 
having the object at positions 503a, 503b and 503c, rather than at the expected positions 502a, 502b, 
and 502c. (Application, Figure 5; Page 3, lines 14-19). To reduce this problem, motion 
compensation techniques, such as that described in the de Haan reference, are utilized to produce 
interpolated intermediate fields for insertion between received fields. (Application, Page 4, lines 
14-23). 

The de Haan paper describes a "motion estimation method specifically suitable for field rate
conversion" in consumer television applications. (de Haan, p. 368). The method is a recursive
block-matching algorithm with a limited number of candidate motion vectors per block. (Abstract).
Each candidate motion vector is evaluated by calculating a sum of absolute differences (SAD) error
value, which compares the luminance function of each pixel in the block under consideration with
the pixels in a previous field at a position offset by the candidate vector. (Equations (4)-(6) and
associated text). The candidate vector with the smallest SAD error value is used to displace all 
pixels in the block to a new position in the interpolated intermediate field. (Section IV, second 
sentence). 
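
For contrast, the SAD evaluation of candidate motion vectors described above can be sketched as follows. This is a simplified illustration only, not the algorithm of the de Haan paper: the function names, the nested-list field layout, the sign convention of the displacement, and the assumption that every candidate keeps the block inside the field are all hypothetical.

```python
from typing import List, Sequence, Tuple

Field = List[List[int]]    # luminance values indexed as field[y][x] (assumed layout)
Vector = Tuple[int, int]   # candidate displacement (dx, dy) in pixels


def sad(current: Field, previous: Field, x0: int, y0: int,
        size: int, vector: Vector) -> int:
    """Sum of absolute differences between a block of the current field and
    the block of the previous field displaced by the candidate vector."""
    dx, dy = vector
    total = 0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            total += abs(current[y][x] - previous[y - dy][x - dx])
    return total


def best_motion_vector(current: Field, previous: Field, x0: int, y0: int,
                       size: int, candidates: Sequence[Vector]) -> Vector:
    """Return the candidate vector with the smallest SAD for the block."""
    return min(candidates,
               key=lambda v: sad(current, previous, x0, y0, size, v))


# Toy data: the current field is the previous field shifted right by one pixel.
previous = [[0, 1, 2, 3],
            [4, 5, 6, 7],
            [8, 9, 10, 11],
            [12, 13, 14, 15]]
current = [[0] + row[:-1] for row in previous]

chosen = best_motion_vector(current, previous, x0=1, y0=1, size=2,
                            candidates=[(0, 0), (1, 0), (-1, 0)])
```

Here the selected vector is purely positional: it states where the block came from in the previous field and carries no coefficients of any enhancement algorithm.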



D. CLAIMS 1, 2, 4-6, 8-10, 12-14, 17 and 18 

Independent Claim 1 is representative of independent Claims 5, 9, 13 and 17 and serves to
illustrate novel limitations of the Appellants' invention: 

1. A receiver, including a video enhancement mechanism for
enhancing video information with spatio-temporal consistency comprising: 

at least one enhancement unit enhancing a characteristic other than 
position of a selected pixel region of video information utilizing at least one 
candidate enhancement vector of enhancement algorithms to generate an 
enhanced pixel region for each candidate enhancement vector, each said enhanced 
pixel region equivalent to enhancement of said selected pixel region utilizing a 
respective candidate enhancement vector of enhancement algorithms; and 

a selection unit computing an error for each said enhanced pixel region 
utilizing a bias towards spatio-temporal consistency of a respective enhanced 
pixel region with spatially adjacent pixel regions in a picture containing said 
selected pixel region and with a counterpart pixel region in one or more pictures 
successive with said picture containing said selected pixel region, said selection 
unit selecting an enhanced pixel region having a best enhancement for spatio- 
temporal consistency. 

The Appellants respectfully assert that the above-emphasized limitations are not shown in the
de Haan reference. The de Haan reference is specifically concerned with a motion estimation 
algorithm for calculating the displacement of pixels in interpolated intermediate fields. Candidate 
motion vectors are considered, and the motion vector with the smallest SAD error value is used as a 
displacement vector to reposition all pixels in a block under consideration. As such, de Haan 
teaches a method of compensating for motion in a frame rate conversion process by adjusting the 
position of pixels in intermediate fields. 

Furthermore, the de Haan reference does not disclose enhancement vectors of enhancement 
algorithms. Although de Haan uses candidate vectors, they are candidate motion vectors. These 
candidate vectors are used to find the best correlation between a block of pixels in the current field 
and a corresponding set in the previous field. As such, the vectors have nothing to do with 
enhancement, as defined in the Appellants' specification. The candidate vectors of the de Haan 
reference define the relationship between subsequent fields with respect to motion. A sum of 
absolute differences is used as a metric to determine the best candidate vector. The motion estimator 
is not enhancing any characteristic of a pixel region; it is simply determining a positional 
relationship between fields. 

In independent Claims 1, 5, 9, 13 and 17, the candidate vectors are enhancement vectors of
enhancement algorithms. The candidate vectors of the Appellants' invention are coefficients of 
enhancement algorithms, rather than displacement vectors describing a positional relation between 
subsequent fields (as taught in the de Haan reference). The Appellants' invention employs an 
objective quality metric to determine the candidate vector with the best coefficients for enhancement.
An error computation having a bias toward spatio-temporal consistency is used in the determination 
of the best candidate enhancement vector. However, the best vector chosen in the Appellants' 
invention contains coefficients for algorithms to enhance pixels in the current field. The vectors in 
the Appellants' invention do not describe a positional relation of the current field with previous or 
subsequent fields. 

The Appellants respectfully submit that de Haan is referring to motion vectors while, in 
contrast, the Appellants' claims are directed to enhancement vectors of enhancement algorithms. As 
such, the Appellants' claims are allowable over de Haan. Accordingly, the Appellants respectfully 
request that the § 102(b) rejection of Claims 1, 2, 4-6, 8-10, 12-14, 17 and 18 be withdrawn and that 
Claims 1, 2, 4-6, 8-10, 12-14, 17 and 18 be passed to allowance. 

ISSUE 2 - REJECTION UNDER 35 U.S.C. § 103 

The rejection of Claims 3, 7, 11, 15, 16, 19 and 20 under 35 U.S.C. § 103(a) is improper and 
should be withdrawn. 

A. OVERVIEW 

Claims 3, 7, 11, 15, 16, 19 and 20 stand rejected under 35 U.S.C. § 103(a) as being
unpatentable over de Haan, G., et al., "True-Motion Estimation with 3-D Recursive Search Block
Matching," IEEE Transactions on Circuits and Systems for Video Processing, vol. 3, no. 5,
pp. 368-79 (October 1993). ("de Haan").

A copy of the claims is provided in Appendix A. A copy of de Haan is provided in
Appendix B. 

B. STANDARD 

In ex parte examination of patent applications, the Patent Office bears the burden of
establishing a prima facie case of obviousness. (MPEP § 2142; In re Fritch, 972 F.2d 1260, 1262,
23 U.S.P.Q.2d 1780, 1783 (Fed. Cir. 1992)). The initial burden of establishing a prima facie basis to
deny patentability to a claimed invention is always upon the Patent Office. (MPEP § 2142; In re
Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d 1443, 1444 (Fed. Cir. 1992); In re Piasecki, 745 F.2d
1468, 1472, 223 U.S.P.Q. 785, 788 (Fed. Cir. 1984)). Only when a prima facie case of obviousness
is established does the burden shift to the Appellant to produce evidence of nonobviousness. (MPEP
§ 2142; In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d 1443, 1444 (Fed. Cir. 1992); In re
Rijckaert, 9 F.3d 1531, 1532, 28 U.S.P.Q.2d 1955, 1956 (Fed. Cir. 1993)). If the Patent Office does
not produce a prima facie case of unpatentability, then without more the Appellant is entitled to grant
of a patent. (In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d 1443, 1444 (Fed. Cir. 1992); In re
Grabiak, 769 F.2d 729, 733, 226 U.S.P.Q. 870, 873 (Fed. Cir. 1985)).

A prima facie case of obviousness is established when the teachings of the prior art itself
suggest the claimed subject matter to a person of ordinary skill in the art. (In re Bell, 991 F.2d 781,
783, 26 U.S.P.Q.2d 1529, 1531 (Fed. Cir. 1993)). To establish a prima facie case of obviousness,
three basic criteria must be met. First, there must be some suggestion or motivation, either in the
references themselves or in the knowledge generally available to one of ordinary skill in the art, to
modify the reference or to combine reference teachings. Second, there must be a reasonable
expectation of success. Finally, the prior art reference (or references when combined) must teach or
suggest all the claim limitations. The teaching or suggestion to make the claimed invention and the
reasonable expectation of success must both be found in the prior art, and not based on Appellant's
disclosure. (MPEP § 2142).

C. CLAIMS 3, 7, 11, 15, 16, 19 and 20 

With respect to Claims 3, 7, 11, 15, 16, 19 and 20, the Appellants note that these claims
depend directly or indirectly from independent Claims 1, 5, 9, 13 and 17. As previously argued,
independent Claims 1, 5, 9, 13 and 17 contain unique and novel claim limitations of the Appellants'
invention not shown in the de Haan reference. Therefore, dependent Claims 3, 7, 11, 15, 16, 19 and
20 also contain those unique and novel claim limitations. As such, the Appellants respectfully
submit that the de Haan reference does not teach or suggest all the limitations of the Appellants'
claims. Thus, the Patent Office has not established a prima facie case of obviousness with respect to
Claims 3, 7, 11, 15, 16, 19 and 20 of the Appellants' invention, and the rejection of the claims
under 35 U.S.C. § 103(a) has thus been overcome. Accordingly, the Appellants respectfully
request that the rejection be withdrawn and that Claims 3, 7, 11, 15, 16, 19 and 20 be passed to
allowance.

CONCLUSION



The Appellants have demonstrated that the present invention as claimed is clearly 
distinguishable over the prior art cited of record. Therefore, the Appellants respectfully request the 
Board of Patent Appeals and Interferences to reverse the final rejection of the Examiner and instruct 
the Examiner to issue a notice of allowance of all claims. 

The Appellants have enclosed a check in the amount of $330.00 to cover the cost of this 
Appeal Brief. The Appellants do not believe that any additional fees are due. However, the 
Commissioner is hereby authorized to charge any additional fees (including any extension of time 
fees) or credit any overpayments to Davis Munck Deposit Account No. 50-0208. 



Respectfully submitted,

Davis Munck, P.C.

William A. Munck
Registration No. 39,308

P.O. Drawer 800889
Dallas, Texas 75380
(972) 628-3600 (main number)
(972) 628-3616 (fax)
E-mail: wmunck@davismunck.com

APPENDIX A
PENDING CLAIMS 

1. A receiver, including a video enhancement mechanism for enhancing video
information with spatio-temporal consistency comprising: 

at least one enhancement unit enhancing a characteristic other than position of a selected 
pixel region of video information utilizing at least one candidate enhancement vector of 
enhancement algorithms to generate an enhanced pixel region for each candidate enhancement 
vector, each said enhanced pixel region equivalent to enhancement of said selected pixel region 
utilizing a respective candidate enhancement vector of enhancement algorithms; and 

a selection unit computing an error for each said enhanced pixel region utilizing a bias 
towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent 
pixel regions in a picture containing said selected pixel region and with a counterpart pixel 
region in one or more pictures successive with said picture containing said selected pixel region, 
said selection unit selecting an enhanced pixel region having a best enhancement for spatio- 
temporal consistency. 

2. A receiver as set forth in Claim 1 wherein said at least one candidate enhancement 
vector is selected from enhancement vectors determined to produce a best enhancement for 
spatio-temporal consistency in enhancing pixel regions within a spatial and temporal 
neighborhood of said selected pixel region. 

3. A receiver as set forth in Claim 1 wherein said bias towards spatio-temporal
consistency further comprises first and second penalties, said first penalty varying based upon 
coefficients for each candidate enhancement vector and said second penalty varying for each 
candidate enhancement vector. 

4. A receiver as set forth in Claim 3 wherein said error is computed on a per-pixel 
region basis for each pixel region within said video information and for each candidate 
enhancement vector for a respective pixel region. 

5. A high definition television receiver comprising: 
a input connection receiving video information; 

a display on which enhanced images derived from said video information are displayed; and

an video enhancement mechanism for enhancing said video information with spatio- 
temporal consistency comprising; 

at least one enhancement unit enhancing a characteristic other than position of a selected 
pixel region of video information utilizing at least one candidate enhancement vector of 
enhancement algorithms to generate an enhanced pixel region for each candidate enhancement 
vector, each said enhanced pixel region equivalent to enhancement of said selected pixel region 
utilizing a respective candidate enhancement vector of enhancement algorithms; and 

a selection unit computing an error for each said enhanced pixel region utilizing a bias
towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent 
pixel regions in a picture containing said selected pixel region and with a counterpart pixel 
region in one or more pictures successive with said picture containing said selected pixel region, 
said selection unit selecting an enhanced pixel region having a best enhancement for spatio- 
temporal consistency. 

6. The receiver as set forth in Claim 5 wherein said at least one candidate 
enhancement vector of enhancement algorithms is selected from enhancement vectors 
determined to produce a best enhancement for spatio-temporal consistency in enhancing pixel 
regions within a spatial and temporal neighborhood of said selected pixel region. 

7. The receiver as set forth in Claim 5 wherein said bias towards spatio-temporal 
consistency further comprises first and second penalties, said first penalty varying based upon 
coefficients for each candidate enhancement vector and said second penalty varying for each 
candidate enhancement vector. 

8. The receiver as set forth in Claim 6 wherein said error is computed on a per-pixel 
region basis for each pixel region within said video information and for each candidate 
enhancement vector for a respective pixel region. 

9. For use in a receiver, a method of enhancing video information with spatio-temporal
consistency comprising:

enhancing a characteristic other than position of a selected pixel region of video 
information utilizing at least one candidate enhancement vector of enhancement algorithms to 
generate an enhanced pixel region for each candidate enhancement vector, each enhanced pixel 
region equivalent to enhancement of the selected pixel region utilizing a respective candidate 
enhancement vector of enhancement algorithms; 

computing an error for each enhanced pixel region utilizing a bias towards spatio- 
temporal consistency of a respective enhanced pixel region with spatially adjacent pixel regions 
in a picture containing the selected pixel region and with a counterpart pixel region in one or 
more pictures successive with the picture containing the selected pixel region; and 

selecting an enhanced pixel region having a best enhancement for spatio-temporal 
consistency. 

10. The method as set forth in Claim 9 wherein the step of enhancing a characteristic 
other than position of a selected pixel region of video information utilizing at least one candidate 
enhancement vector of enhancement algorithms to generate an enhanced pixel region for each 
candidate enhancement vector further comprises: 

selecting the at least one candidate enhancement vector of enhancement algorithms from 
enhancement vectors determined to produce a best enhancement for spatio-temporal consistency 
in enhancing pixel regions within a spatial and temporal neighborhood of the selected pixel
region. 

11. The method as set forth in Claim 9 wherein the step of computing an error for
each enhanced pixel region utilizing a bias towards spatio-temporal consistency of a respective 
enhanced pixel region with spatially adjacent pixel regions in a picture containing the selected 
pixel region and with a counterpart pixel region in one or more pictures successive with the 
picture containing the selected pixel region further comprises: 

adding first and second penalties to the error as the bias, the first penalty varying based 
upon coefficients for each candidate enhancement vector and the second penalty varying for each 
candidate enhancement vector. 

12. The method as set forth in Claim 11 wherein the step of computing an error for
each enhanced pixel region utilizing a bias towards spatio-temporal consistency of a respective 
enhanced pixel region with spatially adjacent pixel regions in a picture containing the selected 
pixel region and with a counterpart pixel region in one or more pictures successive with the 
picture containing the selected pixel region further comprises: 

computing the error on a per-pixel region basis for each pixel region within the video 
information and for each candidate enhancement vector for a respective pixel region. 

13. A computer program product within a computer usable medium for enhancing
video information with spatio-temporal consistency comprising: 

instructions for enhancing a characteristic other than position of a selected pixel region of 
video information utilizing at least one candidate enhancement vector of enhancement algorithms 
to generate an enhanced pixel region for each candidate enhancement vector, each enhanced 
pixel region equivalent to enhancement of the selected pixel region utilizing a respective 
candidate enhancement vector of enhancement algorithms; 

instructions for computing an error for each enhanced pixel region utilizing a bias 
towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent 
pixel regions in a picture containing the selected pixel region and with a counterpart pixel region 
in one or more pictures successive with the picture containing the selected pixel region; and 

instructions for selecting an enhanced pixel region having a best enhancement for spatio- 
temporal consistency. 

14. The computer program product as set forth in Claim 13 wherein the instructions 
for enhancing a characteristic other than position of a selected pixel region of video information 
utilizing at least one candidate enhancement vector of enhancement algorithms to generate an 
enhanced pixel region for each candidate enhancement vector further comprise: 

instructions for selecting the at least one candidate enhancement vector of enhancement 
algorithms from enhancement vectors determined to produce a best enhancement for spatio-temporal
consistency in enhancing pixel regions within a spatial and temporal neighborhood of
the selected pixel region. 

15. The computer program product as set forth in Claim 14 wherein the instructions 
for computing an error for each enhanced pixel region utilizing a bias towards spatio-temporal 
consistency of a respective enhanced pixel region with spatially adjacent pixel regions in a 
picture containing the selected pixel region and with a counterpart pixel region in one or more 
pictures successive with the picture containing the selected pixel region further comprise: 

instructions for adding first and second penalties to the error as the bias, the first penalty 
varying based upon coefficients for each candidate enhancement vector and the second penalty 
varying for each candidate enhancement vector. 

16. The computer program product as set forth in Claim 15 wherein the instructions 
for computing an error for each enhanced pixel region utilizing a bias towards spatio-temporal 
consistency of a respective enhanced pixel region with spatially adjacent pixel regions in a 
picture containing the selected pixel region and with a counterpart pixel region in one or more 
pictures successive with the picture containing the selected pixel region further comprise: 

instructions for computing the error on a per-pixel region basis for each pixel region 
within the video information and for each candidate enhancement vector for a respective pixel 
region. 

17. A video information signal comprising:

a data stream containing one or more pictures; and 

at least one enhanced pixel region within at least one of said pictures, each enhanced 
pixel region derived from received video information by enhancing a characteristic other than 
position of a selected pixel region of said received video information utilizing at least one 
candidate enhancement vector of enhancement algorithms to generate a candidate enhanced pixel 
region for each candidate enhancement vector, each candidate enhanced pixel region equivalent 
to enhancement of said selected pixel region utilizing a respective candidate enhancement vector 
of enhancement algorithms, 

wherein each enhanced pixel region within a respective picture has a best enhancement 
for spatio-temporal consistency among said candidate enhanced pixel regions for an error 
utilizing a bias towards spatio-temporal consistency of said respective enhanced pixel region 
with spatially adjacent pixel regions in a picture containing said selected pixel region and with a 
counterpart pixel region in one or more pictures successive with said picture containing said 
selected pixel region. 

18. The video information signal as set forth in Claim 17 wherein said at least one 
candidate enhancement vector is selected from enhancement vectors determined to produce a 
smallest computed error value in enhancing pixel regions within a spatial and temporal 
neighborhood of said selected pixel region. 

19. The video information signal as set forth in Claim 17 wherein said bias towards
spatio-temporal consistency comprises first and second penalties, said first penalty varying based 
upon coefficients for each candidate enhancement vector and said second penalty varying for 
each candidate enhancement vector. 

20. The video information signal as set forth in Claim 19 wherein each said enhanced 
pixel region within any picture is selected utilizing said error computed on a per-pixel region 
basis for each pixel region within said received video information and for each candidate 
enhancement vector for a respective pixel region. 

True-Motion Estimation with 3-D Recursive Search Block Matching

Gerard de Haan, Paul W. A. C. Biezen, Henk Huijgen, and Olukayode A. Ojo 



Abstract - A new recursive block-matching motion estimation algorithm with only eight candidate
vectors per block is presented. A fast convergence and a high accuracy, also in the vicinity of
discontinuities in the velocity plane, was realized with such new techniques as bidirectional
convergence and convergence accelerators. A new search strategy, asynchronous cyclic search, which
allows a highly efficient implementation, is presented. A new block erosion postprocessing proposal
further effectively eliminates block structures from the generated vector field. Measured with
criteria relevant for the field rate conversion application, the new motion estimator is shown to
have a superior performance over alternative algorithms, while its complexity is significantly less.



I. Introduction

Various algorithms for consumer display scan rate conversion have been proposed [1]-[8], but their
common drawback is decreased dynamic resolution. The alternative provided by motion compensation
techniques [9]-[12] seems, due to the complexity of the motion estimator, to be by far too expensive
for consumer television applications, probably for a long time to come. The existing simpler, albeit
still expensive [13], motion estimation algorithms [14]-[16], on the other hand, cause artifacts in
the up-converted images that are considered to be worse than the blur due to nonmotion compensated
field rate doubling. To illustrate the above, Fig. 1 shows the blur effect of a field repetition
algorithm on a sequence, showing a girl (Renata) moving in front of a zooming and panning camera.
With motion compensation techniques this blur, which increases with increasing speed, can be
eliminated in many picture parts, as shown in Fig. 2. However, the perceptual effect of the errors,
which is clearly visible in Fig. 2, started the thinking about a new motion estimation method
specifically suitable for field rate conversion. It was concluded from the pictures in Figs. 1 and 2
that a strong local distortion seems worse than a global degeneration, even when the latter option
yields a significantly higher mean squared error. The starting hypothesis for the present research,
therefore, was that the generated velocity field should be smooth in the first place, and that
accuracy, at least initially, could be considered to be of secondary importance. Consequently, the
algorithm must contain elements that impose a smoothness constraint on the resulting motion vectors.

Manuscript received December 12, 1992; revised June 2, 1993. 
The authors are with the Visual Communication Systems Group, 
Philips Research Laboratories, 5600 JA Eindhoven, the Netherlands. 
IEEE Log Number 9211372. 




Fig. 1. Motion blur with a field repetition algorithm is clearly notice- 
able, but not very annoying. 




Fig. 2. With motion compensation the artifacts can be very unnatural. 
In this example the OTS algorithm of [16] was used. 



On the other hand, the additional constraints should not 
lead to a computationally complex algorithm, such as that 
known from computer vision research [17]. The hardware 
consequences, if possible, should be taken into considera- 
tion from the very beginning of the algorithm design. 

II. 1-D Recursive Search 

The block-matching algorithms that are most attractive 
for VLSI implementation limit the number of candidate 
vectors to be evaluated. This can be realized through 
recursively optimizing a previously found vector (predict- 
ion). This vector can be a spatially [16] or a temporally 
neighboring result [15]. The performance of these algo- 
rithms in the application of field rate conversion, however, 
proves to be rather poor. Fig. 2, e.g., was obtained by applying
the one-at-a-time search of [16]. The resulting smoothness
of these algorithms is insufficient; this is assumed to
be caused by the large number of evaluated candidate 
vectors, located around the spatial or temporal prediction 
value. This can cause strong deviation from this predic- 
tion, i.e., inconsistencies in the velocity field, as the vector 
selection criterion applied in block matching (minimum 
match error) is no guarantee for obtaining true motion 
vectors. A more successful algorithm for obtaining true 
motion vectors, phase plane correlation [18]-[20], allows 
the match-error criterion to select only between a very 
limited number of very likely candidate vectors. Similarly, 
if spatially or temporally neighboring displacement vec- 
tors are believed to reliably predict the displacement, a 
recursive algorithm should enable true motion estimation, 
if the amount of updates around the prediction vector is 
limited to a minimum. The updates were originally [21]
arranged around the prediction value, similar to the candidate
set CS^N(X, t) in the last (Nth) step of 2-D logarithmic
search, according to [22]. The spatial prediction,
however, was excluded from the candidate set:

CS^{i}(X, t) = \{ C \mid C = D^{i-1}(X, t) + U,\ U \in \{ (\pm L, 0)^T, (0, \pm L)^T \} \}    (1)

where L is the update length, which is measured on the
frame (pixel) grid, X = (X, Y)^T is the position on the
block grid, t is time, and the prediction vector D^{i-1}(X, t)
is selected according to:

D^{i-1}(X, t) \in \begin{cases} \{ 0 \}, & i = 1 \\ \{ C \in CS^{i-1}(X, t) \mid \varepsilon(C, X, t) \le \varepsilon(F, X, t),\ \forall F \in CS^{i-1}(X, t) \}, & i > 1 \end{cases}    (2)

and the candidate set is limited to a set CS^max:

CS^{max} = \{ C \mid -N \le C_x \le +N,\ -M \le C_y \le +M \}    (3)

The resulting estimated displacement vector D(x, t), which
is assigned to all pixel positions, x = (x, y)^T, in the block
B(X) of size X * Y with center X:

B(X) = \{ x \mid X_x - X/2 \le x \le X_x + X/2 \ \wedge\ X_y - Y/2 \le y \le X_y + Y/2 \}    (4)

equals the candidate vector C(X, t) with the smallest
error ε(C, X, t):

\forall x \in B(X):\ D(x, t) \in \{ C \in CS^{i}(X, t) \mid \varepsilon(C, X, t) \le \varepsilon(F, X, t),\ \forall F \in CS^{i}(X, t) \}    (5)

Errors are calculated as summed absolute differences
(SAD):

\varepsilon(C, X, t) = \sum_{x \in B(X)} | F(x, t) - F(x - C, t - T) |    (6)

where F(x, t) is the luminance function and T the field
period. The block size is fixed to X = Y = 8, although
experiments indicate little sensitivity of the algorithm to
this parameter. The prediction vector D^{i-1}(X, t) is a
previously calculated displacement vector taken from a
position spatially adjacent to the current block. Many
options can be thought of that are completely analogous
to options mentioned in, e.g., [23] and [24] for the
pel-recursive algorithms: the result from the block at the
left-hand side, the block above, the best of these two, or a
weighted average of resulting vectors in a spatial
neighbourhood can be considered. We will return to the subject
later and conclude here with the general relation between
the (one) spatial prediction vector (from now on indicated
with S(X, t) rather than D^{i-1}(X, t)), and the vectors
previously calculated:

S(X, t) = D(X - SD, t)    (7)

where SD (SD_x ≥ 0 and SD_y ≥ 0) points from the center
of the block from which the prediction vector is taken to
the center of the current block.

III. 2-D Recursive Search

A recursive search block-matching algorithm as discussed
thus far has the drawback of one-dimensional
convergence. Hence, at boundaries of moving objects a
run-in occurs, which visibly deteriorates the contours when
the vectors are applied for temporal interpolation of pictures.
It is already well known from pel-recursive techniques
(e.g., [23]) that, with predictions calculated from a
two-dimensional area or even a (motion compensated)
three-dimensional space, the convergence can be improved
considerably. In this section, a new two-dimensional
prediction strategy is introduced that, in contrast to
known techniques, does not dramatically increase the
complexity of the hardware. The fundamental difficulty
with a one-dimensionally recursive algorithm is that it
cannot cope with discontinuities in the velocity plane, as
those occurring particularly at the boundaries of moving
objects. The first impression may be that smoothness
constraints exclude good step response in a motion estimator
design. The dilemma of combining smooth vector
fields with a good step response can be circumvented,
however, as will be shown in this section. The method was
earlier published in [21] and [25].

Let us assume that the discontinuities in the velocity 
plane are spaced at a distance that enables convergence 
of the recursive block matcher in between two discontinu- 
ities. When this assumption is satisfied, the recursive 
block matcher yields the correct vector value at the first 
side of the object boundary and starts converging at the 
opposite (second) side. The convergence direction here 
points from side one to side two. Either side of the 
contour can be estimated correctly, depending on the 
convergence direction chosen, though not both simultaneously.
Therefore, it seems attractive to concurrently apply
two estimator processes, as indicated in Fig. 3, with oppo- 
site convergence directions. It can then be decided which 
of these two estimators yields the correct displacement 
vector at the output. As a selection criterion, the already 
available SAD of both vectors can be used. Bidirectional 
convergence, hereinafter referred to as 2-D C, can then 
be formalized as a process that yields a displacement 
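vector:

D(X, t) = \begin{cases} D_a(X, t), & \varepsilon(D_a, X, t) \le \varepsilon(D_b, X, t) \\ D_b(X, t), & \varepsilon(D_a, X, t) > \varepsilon(D_b, X, t) \end{cases}    (8)

where

\varepsilon(D_a, X, t) = \sum_{x \in B(X)} | F(x, t) - F(x - D_a, t - T) |    (9)

and

\varepsilon(D_b, X, t) = \sum_{x \in B(X)} | F(x, t) - F(x - D_b, t - T) |    (10)

while D_a and D_b are found in a spatial recursive process
as described in Section II, updating, respectively, prediction
vectors S_a(X, t):

S_a(X, t) = D_a(X - SD_a, t)    (11)

and S_b(X, t):

S_b(X, t) = D_b(X - SD_b, t)    (12)

where

SD_a \ne SD_b    (13)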



As indicated in (13), the two estimators have unequal 
spatial recursion vectors SD. If the two convergence directions
are opposite (or at least different, as will be
discussed later), then 2-D C solves the run-in problem at 
the boundaries of moving objects. This is because one of 
the estimators will have converged already at the position 
where the other is yet to do so. Hence, the concept 
combines the consistent velocity field of a recursive pro- 
cess with the fast step response as required at the con- 
tours of moving objects. The attractiveness of a conver- 
gence direction varies significantly for hardware. Refer- 
ring to Fig. 4, the predictions taken from blocks 1, 2, or 3, 
are convenient for hardware. Block 4 is less attractive, as 
it complicates pipelining of the algorithm (the previous 
result has to be ready before the next can be calculated). 
Block 5 is not attractive, as the causality problem must be 
solved by reversing the line scan. This costs several line 
memories in the hardware. Finally, blocks 6, 7, and 8 are 
totally unattractive, as they require field memories for 
reversing the vertical scan direction. When applying only 
the preferred blocks, the best implementation of 2-D C 
results with predictions from blocks 1 and 3 for estimators 
a and b, respectively. The angle between the convergence 
directions can be enlarged by taking predictions from 
blocks P and Q in Fig. 4 (equally attractive for hardware), 



Fig. 3. The bidirectional convergence principle.

Fig. 4. Locations around the current block, from which the estimation
result could be used as a spatial prediction vector.



though the spatial prediction distance then increases. This 
was, however, found to perform worse experimentally 
than 2-D C with the predictions taken from blocks 1 and 
3. A solution for the suboptimal situation of the conver- 
gence directions turns out to be necessary and will be 
discussed in the next section. Fig. 5 shows the conver- 
gence directions for the preferred design, applying the 
most convenient spatial predictors. 

IV. 3-D Recursive Search 

Both estimators (a and b) in the algorithm thus far
produce four candidate vectors each (C) by updating
their spatial predictions S_a(X, t) and S_b(X, t). The best
candidate, selected with the SAD criterion, is assigned as
the resulting displacement vector D(x, t) to all pixels in
the block B(X) of size X * Y and centered at X in the
present field. The spatial predictions were selected to
yield two perpendicular diagonal convergence axes:

S_a(X, t) = D_a(X - (X, Y)^T, t)
S_b(X, t) = D_b(X - (-X, Y)^T, t)    (14)



The suboptimal situation arises from the assumed diffi- 
culty in obtaining convergence, which is (partly) opposite 
to the scanning of the pictures. There is a method that 
realizes convergence from the bottom to the top of the 
screen (see [26]). Predictions can be taken (e.g., from 
spatial positions 6-8 in Fig. 4) without causality problems 
if the results from a previously calculated vector field are 
used. The underlying assumption is that the displacements 
between two consecutive velocity planes, due to move- 
ments in the picture, are small compared to the block size. 
This assumption enables the definition of a third and a 
fourth estimator, c and d, in addition to the estimators a

Fig. 5. Location of the spatial predictions of estimators a and b with
respect to the current block. The arrows indicate the convergence
directions.

Fig. 6. The relative positions of the spatial predictors S_a and S_b and
the convergence accelerators T_a and T_b.

and b and also generating four candidate vectors by
updating the prediction value. Selecting the predictions 
for estimators c and d from positions 6 and 8 in Fig. 4, 
respectively, yields additional convergence directions ex- 
actly opposite to those of estimators a and b. The result- 
ing design, however, is not symmetrical in its convergence 
behaviour, as the convergence speed is much lower due to 
the temporal component in the prediction delays of esti- 
mators c and d. Recognizing this limitation, there are 
simpler options to benefit from candidates taken from a 
previous vector field. Rather than choosing the additional 
estimators c and d, it is suggested here to apply vectors 
from positions opposite to the spatial prediction position
as additional candidates in the already defined estimators.
This saves hardware, as fewer errors have to be calculated.
Furthermore, working with fewer candidates reduces
the risk of inconsistency [27]. The concept now is
that a fifth candidate in each original estimator, a temporal
prediction value from the previous field (T_a and T_b for
estimators a and b, respectively), accelerates the convergence
of the individual estimators by introducing a look
ahead into the convergence direction. These convergence
accelerators (CA's) are not taken from the corresponding
block in the previous field (D(X, t - T)), but from a
block shifted diagonally over r blocks and opposite to the
blocks from which the spatial predictions S_a and S_b result:

T_a(X, t) = D(X + r \cdot (X, Y)^T, t - T)
T_b(X, t) = D(X + r \cdot (-X, Y)^T, t - T)    (15)

Increasing r implies a larger look ahead, but the reliabil- 
ity of the prediction decreases with increasing spatial 
distance (r), as the correlation between the vectors in a 
velocity plane can be expected to drop with increasing 
distance. A value of r = 2 has been experimentally found to be best
for a block size of 8*8 pixels. The resulting relative 
positions are drawn in Fig. 6. For the resulting 3-D RS 
block-matching algorithm, the displacement vector D(x, t) 
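is calculated according to equations (8), (9), and (10),
where D_a(X, t) and D_b(X, t) result from estimators a and
b, respectively, calculated in parallel with the candidate
set CS_a:

CS_a(X, t) = \{ C \in CS^{max} \mid C = D_a(X - (X, Y)^T, t) + U,\ U \in \{ (\pm L, 0)^T, (0, \pm L)^T \} \} \cup \{ T_a(X, t) \}    (16)

and CS_b:

CS_b(X, t) = \{ C \in CS^{max} \mid C = D_b(X - (-X, Y)^T, t) + U,\ U \in \{ (\pm L, 0)^T, (0, \pm L)^T \} \} \cup \{ T_b(X, t) \}    (17)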



while errors are assigned to candidate vectors using the 
SAD criterion of equation (6). Due to the faster conver- 
gence, the CA's are particularly advantageous at the top 
of the screen, where the spatial process starts converging. 
Furthermore, they improve the temporal consistency. 

V. Updating Strategy 

The optimization of the constant L (the update length) 
in the definitions (16) and (17) of the candidate sets of 
each estimator, CS_a(X, t) and CS_b(X, t), reveals some conflicting
demands. A small integer value of the update
allows convergence to accurate output vectors, whereas a 
relatively high value is expected to enable quicker conver- 
gence. It is possible to circumvent the choice by adapting 
L individually for estimators a and b, e.g., to the SAD 
error value of the spatial prediction. This option was 
indicated in [21]. Alternatives, however, were investigated 
(see also [26] and [28]). 

It is conceivable to bring down the number of candidate 
vectors deviating from 0 to the very limit, one only. Any 
other number is possible as well without losing either of 
the aforementioned options on the average or adding 
excessive complexity to determine the candidates. Apart 
from the zero update, only one update vector U(X, t) is
added to the spatial predictions S_a(X, t) and S_b(X, t) in
both estimators a and b. The components of the second
update originally were generated with two independent
pseudorandom generators. However, experiments indicate
that there is no need for these random generators, as a
simple cyclic update generator realized with a counter and
a look-up table (LUT) already yields good results, particularly
when the counter runs asynchronously with the scanning
of the blocks in a picture. For this final version of the
3-D RS block matcher that will be used in the evaluation
section, the displacement range again is limited to CS^max:

CS^{max} = \{ C \mid -N \le C_x \le +N,\ -M \le C_y \le +M \}    (18)

and the proposed candidate set CS_a(X, t) for estimator a
applying this updating strategy, hereinafter referred to as
asynchronous cyclic search (ACS), is found as:

CS_a(X, t) = \{ C \in CS^{max} \mid C = D_a(X - (X, Y)^T, t) + U_a(X, t) \} \cup \{ D(X + 2 \cdot (X, Y)^T, t - T),\ 0 \}    (19)

where the update vectors U_a(X, t) are found as

U_a(X, t) \in \{ 0,\ lut(N_{bl}(X, t) \bmod p) \}    (20)

where N_bl is the output of a block counter, lut is a
look-up table function, and p is not a factor of the
number of blocks in a picture. For estimator b, similarly,
the candidate set CS_b(X, t) is found as

CS_b(X, t) = \{ C \in CS^{max} \mid C = D_b(X - (-X, Y)^T, t) + U_b(X, t) \} \cup \{ D(X + 2 \cdot (-X, Y)^T, t - T),\ 0 \}    (21)

while, possibly from the same LUT, updates U_b(X, t) are
generated as

U_b(X, t) \in \{ 0,\ lut((N_{bl}(X, t) + offset) \bmod p) \}    (22)

which differ from U_a(X, t) due to the integer offset added
to the value of the block rate counter. The estimator n
(n = a, b) yields a vector D_n(X, t) chosen from the candidate
set CS_n(X, t) [see (19) and (21)] such that it minimizes
the matching error:

\varepsilon(C, X, t) = \sum_{x \in B(X)} | F(x, t) - F(x - C, t - T) |    (23)

where the match error is summed over a block B(X),
defined as

B(X) = \{ x \mid X_x - X/2 \le x \le X_x + X/2,\ X_y - Y/2 \le y \le X_y + Y/2 \}.    (24)

The best of the two vectors resulting from estimators a 
and b is selected in the output multiplexer and assigned to 
all pixels in B(X) [see (8)]. 
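The matching step in (23) and the output selection in (8) can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: fields are plain 2-D lists of luminance values, a block is addressed by its top-left pixel rather than its center, and displaced pixels that fall outside the previous field are clamped to the border. The function names are hypothetical.

```python
def clamp(v, lo, hi):
    """Limit v to the closed interval [lo, hi]."""
    return max(lo, min(v, hi))


def sad(cur, prev, x0, y0, cand, size=8):
    """Summed absolute difference for one candidate vector.

    cur and prev are 2-D lists (present and previous field). (x0, y0) is
    the top-left pixel of the block, an assumption made for simplicity.
    Border clamping is also an assumption of this sketch.
    """
    cx, cy = cand
    h, w = len(prev), len(prev[0])
    total = 0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            px = clamp(x - cx, 0, w - 1)
            py = clamp(y - cy, 0, h - 1)
            total += abs(cur[y][x] - prev[py][px])
    return total


def best_candidate(cur, prev, x0, y0, candidates, size=8):
    """Return (vector, error) for the candidate with the smallest SAD."""
    errors = [(sad(cur, prev, x0, y0, c, size), c) for c in candidates]
    err, vec = min(errors, key=lambda e: e[0])
    return vec, err


def output_multiplexer(result_a, result_b):
    """Choose between estimators a and b by their match errors."""
    (d_a, err_a), (d_b, err_b) = result_a, result_b
    return d_a if err_a <= err_b else d_b
```

Each estimator would call `best_candidate` on its own candidate set, and `output_multiplexer` then keeps the vector with the lower error for all pixels of the block.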

The vector 0 in the candidate sets CS a (X,t) and 
CS b (X,t) [(19) and (21)] improves the performance for 
small stationary image pans but introduces a serious risk 



of disturbing the convergence. In Section VI we will 
return to this issue. The optimal search area will most 
likely reveal a symmetrical distribution of candidate vec- 
tors around the spatial prediction vector, as there is no 
preferred velocity direction. This can be translated into 
the a priori expectation that the output of the generators 
(the modulo p counter plus LUT) on the average should 
be zero. Further, the variance of the vertical update is 
smaller than that of the horizontal update, as large hori- 
zontal movements seem to occur more frequently than 
large vertical movements. It may be advantageous to 
adapt the distributions at the output of the update gener- 
ator to, e.g., the match errors obtained, in the sense that a 
narrower distribution is chosen in case of small errors. 
These topics can be subject for further research. Good 
results (see the evaluation in Section IX) were obtained 
from an estimator with the ACS strategy where the LUT 
contained the following updates: 

\{ (0, 1)^T, (0, -1)^T, (0, 2)^T, (0, -2)^T, (1, 0)^T, (-1, 0)^T, (3, 0)^T, (-3, 0)^T \}    (25)

To realize a symmetrical distribution around 0 with p 
updates, a number of 0 updates and symmetrical pairs of 
the other updates can be added at will to arrive at a value 
of p that is not a factor of the number of blocks in a field 
[30]. 
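The asynchronous cyclic search can be sketched as a counter, a modulo operation, and a table look-up. The sketch below is illustrative only: the LUT contents, the range limits `n_max` and `m_max`, and the function names are assumptions chosen for the example (symmetric pairs averaging to zero, a wider horizontal than vertical spread, and one zero entry so that p = 9).

```python
# Illustrative look-up table (an assumption of this sketch).
LUT = [(0, 1), (0, -1), (0, 2), (0, -2),
       (1, 0), (-1, 0), (3, 0), (-3, 0), (0, 0)]


def cyclic_update(block_count, p, offset=0, lut=LUT):
    """Update vector selected by a block counter taken modulo p."""
    return lut[(block_count + offset) % p]


def candidate_set(spatial, update, accelerator, n_max=12, m_max=8):
    """Spatial prediction, updated prediction, accelerator and zero vector.

    Candidates outside the range +/-n_max, +/-m_max are dropped, and
    duplicates are removed while keeping the order of first appearance.
    """
    sx, sy = spatial
    ux, uy = update
    raw = [spatial, (sx + ux, sy + uy), accelerator, (0, 0)]
    cands = []
    for cx, cy in raw:
        in_range = -n_max <= cx <= n_max and -m_max <= cy <= m_max
        if in_range and (cx, cy) not in cands:
            cands.append((cx, cy))
    return cands
```

If the number of blocks per picture is not a multiple of p, the counter does not restart in step with the picture, so a given block position sees a different update in successive pictures. Estimator b can reuse the same table with a non-zero `offset`.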

VI. Further Emphasis on Smoothness 

It happens that blocks shifted over very different vectors
with respect to the current block contain the same
information, particularly on periodic structures. A block
matcher therefore will randomly select one of these vectors
due to small differences in the matching error caused
by noise in the picture. If the estimate is used for tempo- 
ral interpolation, very disturbing artifacts will be gener- 
ated in the periodic picture part. For the 3-D RS block 
matcher, the spatial consistency could guarantee that, 
after reaching a converged situation at the boundary of a 
moving object, no other vectors will be selected. This, 
however, functions only if none of the other candidate 
vectors that yield an equally good matching error are ever 
generated. A number of risks jeopardize this constraint: 

1) An element of the update sets US_a(X, t) and
US_b(X, t) may equal a multiple of the basic period
of the structure.

2) "The other" estimator may not be converged, or may
be converged to a value that does not correspond to
the actual displacement.

3) Directly after a scene change, it is possible that one
of the convergence accelerators T_a(X, t) or T_b(X, t)
yields the threatening candidate.

It is possible to improve the result, as far as risks mentioned
under 1) and 3) are concerned, by adding penalties
to the error function related to the length of the difference
vector between the candidate to be evaluated and
some neighbouring vectors. This was mentioned in [31].
For the 3-D RS block matcher a very simple implementation
is realized with a penalty depending on the length of
the update:

\varepsilon(C, X, t) = \sum_{x \in B(X)} | F(x, t) - F(x - C, t - T) | + \alpha \cdot \| U(X, t) \|    (26)

For the CA's, this is not applicable, but a fixed penalty
equal to α, independent of the difference with S_a(X, t) or
S_b(X, t), proved to give satisfactory results. Experimental
optimization of α led to a value equal to 0.24% of the
maximum error value. Other experiments, however, indicate
that fixed penalties for all updates can be applied.
Optimization led to values for these fixed penalties of,
respectively, 0.4%, 0.8%, and 1.6% of the maximum error
value, for the cyclic update, the convergence accelerator,
and the fixed 0 candidate vector. This last candidate
especially requires a large penalty, as it introduces the
risk of convergence interruption in flat areas.
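The fixed-penalty variant can be illustrated with a small sketch. It assumes that the maximum error value is that of an 8 * 8 block of 8-bit luminance samples (64 * 255) and that the unmodified spatial prediction carries no penalty. Both are assumptions of this example, as are the names used.

```python
# Assumed maximum SAD: 8 x 8 pixels, 8-bit luminance.
MAX_ERROR = 8 * 8 * 255

# Fixed penalties as fractions of the maximum error value.
PENALTY = {
    "spatial": 0.0,        # unmodified spatial prediction (assumed free)
    "update": 0.004,       # cyclic update
    "accelerator": 0.008,  # convergence accelerator
    "zero": 0.016,         # fixed 0 candidate vector
}


def penalized_error(sad_value, kind, max_error=MAX_ERROR):
    """Match error plus the fixed penalty for this candidate type."""
    return sad_value + PENALTY[kind] * max_error


def pick_best(candidates):
    """candidates: list of (vector, kind, sad_value); return the best vector."""
    best = min(candidates, key=lambda c: penalized_error(c[2], c[1]))
    return best[0]
```

With these numbers an updated candidate must beat the spatial prediction by more than about 65 SAD units before it is selected, which is the bias towards the smooth solution that the penalties are meant to provide.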

The risk mentioned under 2), however, is not reduced 
with these update penalties. The situation described typi- 
cally occurs if a periodic part of an object enters the 
picture from the blanking or appears from behind another 
object in the image. In that situation, one of the two 
estimators can converge to the wrong vector value, since 
there is no boundary moving along with the periodic 
picture part to prevent this. Therefore, an experiment was 
set up, with S_a(X, t) set to the value of S_b(X, t) if

\varepsilon(D_a, X - SD_a, t) > \varepsilon(D_b, X - SD_b, t) + Th    (27)

where Th is a fixed threshold, and, reversely, S_b(X, t) is set
to the value of S_a(X, t) in case

\varepsilon(D_b, X - SD_b, t) > \varepsilon(D_a, X - SD_a, t) + Th    (28)

This attempt to cope with the problem through linking
the two estimators was based upon the following hypothesis:
In the first blocks next to the horizontal blanking
interval (or the "other object"), one estimator will be
converged, whereas the other must start from its initialization
value, or from the velocity of the other object.
Usually this will result in a remarkable difference in the
match errors ε(D_a, X, t) and ε(D_b, X, t) of the two estimators
a and b. Substitution of the spatial prediction
vector of the worst matching estimator by the spatial 
prediction vector of the best matching estimator prevents 
the algorithm from converging to the wrong value. The 
threshold Th in (27) and (28), above which the predictions 
are linked, turned out to be useful, as the advantage of 
two independent estimators would otherwise be lost. In 
[29], an adapted version of the penalties of updating is 
mentioned, applying coarser quantization of the match 
errors in an attempt to arrive at the same goal. 
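The linking of the two estimators amounts to a thresholded comparison of the errors that belong to the two spatial predictions. A minimal sketch, with hypothetical names and the errors passed in as plain numbers:

```python
def link_predictions(s_a, s_b, err_a, err_b, th):
    """Replace the clearly worse spatial prediction by the better one.

    err_a and err_b are the match errors found at the blocks from which
    s_a and s_b were taken. Within the threshold th both predictions are
    kept, so the two estimators stay independent.
    """
    if err_a > err_b + th:
        return s_b, s_b
    if err_b > err_a + th:
        return s_a, s_a
    return s_a, s_b
```

The threshold keeps small error differences from collapsing the two estimators into one, which would remove the benefit of two convergence directions.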



VII. Block Erosion to Eliminate Blocking Effects

If the motion information is limited to one vector per
block of pixels, motion compensation sometimes introduces
visible block structures in the interpolated picture.
The block sizes commonly used in block matching are in a
range that gives rise to very visible artifacts [32]. A post-
filter on the vector field can overcome this problem. The
option was mentioned in [15] but has the drawback that
discontinuities in the vector field are blurred as well.
Therefore a post-operation is introduced in this section; it
eliminates fixed block boundaries from the vector field
without blurring contours. Furthermore, an option is found
that prevents vectors that did not result from the estimator
from being generated. This is attractive mainly for
algorithms that yield vectors that have a poor relation to
the actual object velocities. In case of a full search method,
for example, it is not unlikely that the average of two
vectors yielding a low match error will result in a vector
that gives a bad match. The method was published in [26]
and [33]. In case of a velocity field for which one vector
per block is available, it is proposed to divide each block
B(X)

B(X) = \{ x \mid X_x - X/2 \le x \le X_x + X/2 \ \wedge\ X_y - Y/2 \le y \le X_y + Y/2 \}    (29)

to which a vector D(x, t) is assigned, into four sub-blocks
B_{i,j}(X),

B_{i,j}(X) = \{ x \mid X_x - (1 - i) \cdot X/4 \le x \le X_x + (1 + i) \cdot X/4 \ \wedge\ X_y - (1 - j) \cdot Y/4 \le y \le X_y + (1 + j) \cdot Y/4 \}    (30)

where the variables i and j take the values +1 and -1.
To the pixels in each of the four sub-blocks B_{i,j}(X) a
vector D_{i,j}(x, t) is assigned:

\forall x \in B_{i,j}(X):\ D_{i,j}(x, t) = D_{i,j}(X, t), \quad i, j = 1, -1    (31)

where

D_{i,j}(X, t) = med( D(X, t),\ D(X + i \cdot (X, 0)^T, t),\ D(X + j \cdot (0, Y)^T, t) )    (32)

The med function is a median on the x and y vector
components separately:

med(X, Y, Z) = ( median(X_x, Y_x, Z_x),\ median(X_y, Y_y, Z_y) )^T    (33)

Because of this separate operation, a new vector that was
neither in the block itself nor in the neighbouring blocks
can be created. To prevent this, it could be checked
whether the new vector is equal to one of the three input
vectors. And in case

med( D(X + i \cdot (X, 0)^T, t),\ D(X, t),\ D(X + j \cdot (0, Y)^T, t) ) \notin \{ D(X + i \cdot (X, 0)^T, t),\ D(X, t),\ D(X + j \cdot (0, Y)^T, t) \}    (34)

to the pixels in the quadrant the original vector is assigned:

\forall x \in B_{i,j}(X):\ D_{i,j}(x, t) = D(X, t).    (35)

Fig. 7 illustrates the process, showing with shading the
areas from which the vectors are taken to calculate the
result for sub-block H_{-1,-1}: the neighbouring blocks E and
G, and block H itself. The block erosion can be repeated
for the sub-blocks in case of large initial blocks. Each
sub-block is then again subdivided into four parts.
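One erosion pass over a block vector field can be sketched as follows. This is an illustration under stated assumptions rather than the published implementation: the field is a 2-D list of (x, y) vectors, one per block, neighbours beyond the field border are replaced by the nearest block, and all names are hypothetical.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]


def sub_block_vector(d, h, v):
    """Component-wise median of three vectors, with fallback to d.

    d is the block's own vector, h the horizontal and v the vertical
    neighbour. If the component-wise median is none of the three inputs,
    the original vector d is kept.
    """
    m = (median3(d[0], h[0], v[0]), median3(d[1], h[1], v[1]))
    return m if m in (d, h, v) else d


def erode(field):
    """Return a field with twice the resolution in both directions."""
    rows, cols = len(field), len(field[0])
    out = [[None] * (2 * cols) for _ in range(2 * rows)]
    for r in range(rows):
        for c in range(cols):
            d = field[r][c]
            for j in (-1, 1):
                for i in (-1, 1):
                    rn = min(max(r + j, 0), rows - 1)  # clamp at border
                    cn = min(max(c + i, 0), cols - 1)
                    h = field[r][cn]   # neighbour in the i direction
                    v = field[rn][c]   # neighbour in the j direction
                    out[2 * r + (j + 1) // 2][2 * c + (i + 1) // 2] = (
                        sub_block_vector(d, h, v))
    return out
```

Calling `erode` again on its own output corresponds to repeating the erosion on the sub-blocks.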

VIII. Evaluation Tools

A. The M2SE Quality Indicator for Estimated Vectors 

To indicate the performance of a motion estimator, 
often a mean square prediction error (MSE) is used, or 
the entropy of the prediction error. Both are objective 
and relevant for coding applications of motion estimation. 
For field rate conversion applications, however, these 
measures, as already mentioned in the introduction, are 
not very relevant. Therefore, in this section a first attempt 
to arrive at objective motion estimator evaluation tools 
for field rate conversion is presented. As the proposed 
criteria are new, their assumed validity will be illustrated
with pictures from vector fields, which allow a subjective 
comparison of the performance of some of the best algo- 
rithms. 

The first proposed performance indicator is a modified 
mean square prediction error (M2SE). The quintessence 
of the modification is that the validity of the vectors is 
extrapolated outside the temporal interval on which they 
are calculated. The extrapolation, because of object iner- 
tia, is expected to be more legitimate if the vectors repre- 
sent true motion than if they indicate only a good match 
between blocks of pixels. 

As illustrated in Fig. 8, displacement vectors D(x, t) are
calculated between the previous field at time t - T and
the present field at t, according to the definition given in
(5) and (6). With vectors so defined, output sequences are
created by interpolating each output field as the motion
compensated average from two successive input fields,
using displacement vectors from various motion estimation
algorithms under evaluation. Interpolated output
fields are thus found as

F_{mc}(x, t) = \frac{1}{2} \cdot ( F(x - D(x, t), t - T) + F(x + D(x, t), t + T) ).    (36)

Fig. 7. The center block H divided into four sub-blocks, H_{i,j}, i = ±1
and j = ±1.

Fig. 8. A motion compensated average is calculated of fields at instants
t - T and t + T, applying vectors estimated between field t and t - T.

To calculate the proposed performance indicator, the
squared pixel differences between the interpolated output
and the original input field are summed over a field 
excluding a boundary area and normalized with respect to 
the number of pixels in this measurement window. Fur- 
ther, the resulting figures, obtained from five different 
input test sequences, are averaged. Hence the resulting 
M2SE performance criterion can be written as

M2SE(t) = \frac{1}{5} \sum_{s=1}^{5} \frac{1}{P \cdot L} \sum_{x \in MW} ( F_s(x, t) - F_{mc}(x, t) )^2    (37)

where the index s identifies the test sequence (1, 2, 3, 4,
or 5) to which the luminance function F_s(x, t) belongs and
on which also F_mc(x, t) is calculated. P · L is the number
of pixels in the measurement window MW that equals the 
entire image, excluding a margin of N pixels wide hori- 
zontally and M vertically, where M and N define the 
vector range of the motion estimator according to (3). To 
enable convergence of algorithms that include the use of 
temporal predictors, it is proposed to calculate the M2SE, 
defined in (37), in the fourth field of each sequence. This 
seems reasonable, as the human observer also needs some 
time to interpret pictures after a scene cut. The five test 
sequences were selected to provide critical test material
and include several categories of difficulties. This broad 
range of difficulties was assumed to make the M2SE more 
meaningful, though the inhomogeneous data set obviously 
implies a rather high variance on the figures. 
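The M2SE computation can be sketched for a single field and then averaged over the test sequences. The sketch assumes integer, per-pixel vectors stored in a 2-D list and a margin equal to the vector range, so that every displaced pixel stays inside the field. The function names are hypothetical.

```python
def mc_average(prev, nxt, vectors, x, y):
    """Motion compensated average of the fields at t - T and t + T."""
    dx, dy = vectors[y][x]
    return 0.5 * (prev[y - dy][x - dx] + nxt[y + dy][x + dx])


def field_mse(prev, cur, nxt, vectors, n, m):
    """Mean squared interpolation error inside the measurement window.

    The window excludes a margin of n pixels horizontally and m pixels
    vertically, matching the vector range of the estimator.
    """
    h, w = len(cur), len(cur[0])
    total = 0.0
    count = 0
    for y in range(m, h - m):
        for x in range(n, w - n):
            diff = cur[y][x] - mc_average(prev, nxt, vectors, x, y)
            total += diff * diff
            count += 1
    return total / count


def m2se(per_sequence_values):
    """Average the per-sequence figures (five sequences in the text)."""
    return sum(per_sequence_values) / len(per_sequence_values)
```

For a pattern that moves with constant velocity, the true vectors give a zero error, while vectors that merely match well between two fields but do not describe the motion leave a residual when extrapolated to t + T.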

The low velocities and the fine details of the Renata
test sequence (see Fig. 1) make it most likely that algorithms
will be well converged in the fourth field, so that
the M2SE measure, for this sequence, will indicate the
accuracy of a method. This sequence was also applied in a
version accelerated three times (highest velocity around
12 pel/T) to provide the second sequence in the test.
Now the fine details and the large difference in
foreground/background velocity provide a critical test for
covering and uncovering situations and, in case of a
recursive algorithm, the convergence speed is tested. The
Car & Gate sequence (for a picture see Fig. 19) was
applied as the third test sequence. The old car shown
drives towards the camera, which is zooming out. The gate
is closing behind the car. The velocities are fairly low. A
picture from the fourth sequence, Roller Coaster, is shown
in Fig. 9. Fairly large displacements, up to 16 pel/T, are
caused by the fast-moving train while, due to the looping,
velocities in many directions occur. A stationary image
part exists close to the fast-moving roller coaster. A picture
of the fifth test sequence, the BBC-DRUM-TEXT sequence,
is shown in Fig. 10. A stationary text, VALVO
AL, was superimposed on a picture attached to a rotating
drum from which a fast (11-12 pel/T), but almost uniform,
motion resulted. The main difficulty here is the
strong discontinuity in the velocity field, which is caused
by the superimposed text. It is challenging for algorithms
that presume smoothness in the displacement vector field.

B. The Vector Field Smoothness Indicator

As discussed in the introduction, it was observed for 
motion compensated pictures that inconsistencies in the 
estimated displacement vector field are a major threat to 
quality. Inconsistencies could spoil the result to a level 
where the viewer prefers simple non-motion compensated 
field rate conversion techniques, such as field averaging or 
field repetition. It was concluded that the spatial smooth- 
ness of the velocity field is of major importance. The 
second performance indicator proposed for the evaluation 
of motion estimation algorithms, therefore, is a smoothness
figure S(t), defined as

S(t) = \frac{8 N_b}{\sum_{X} \sum_{k=-1}^{1} \sum_{l=-1}^{1} ( |\Delta_x(X, k, l, t)| + |\Delta_y(X, k, l, t)| )}    (38)

where X runs through all centers of the blocks within the 
fourth field of a test sequence, excluding the boundary for 
obvious reasons. N_b is the number of blocks in a field, and

\Delta_x(X, k, l, t) = D_x(X, t) - D_x(X + (k \cdot X, l \cdot Y)^T, t)
\Delta_y(X, k, l, t) = D_y(X, t) - D_y(X + (k \cdot X, l \cdot Y)^T, t)    (39)

This smoothness figure drops in case the motion estimator 
under test generates a vector field with many inconsisten- 
cies. A high score on this criterion, therefore, qualifies the 
algorithm as good for field rate conversion applications. 
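A smoothness figure of this kind can be sketched as follows. The exact normalization is an assumption of this example: the figure is taken as eight times the number of evaluated blocks divided by the summed absolute component differences between each block vector and its eight neighbours, so that a more consistent field scores higher. Only interior blocks are evaluated, and a perfectly uniform field is reported as infinity.

```python
def smoothness(field):
    """Smoothness of a block vector field given as a 2-D list of (x, y)."""
    rows, cols = len(field), len(field[0])
    total = 0
    n_blocks = 0
    for r in range(1, rows - 1):          # skip the boundary blocks
        for c in range(1, cols - 1):
            n_blocks += 1
            dx, dy = field[r][c]
            for l in (-1, 0, 1):
                for k in (-1, 0, 1):
                    nx, ny = field[r + l][c + k]
                    total += abs(dx - nx) + abs(dy - ny)
    if total == 0:
        return float("inf")               # no inconsistencies at all
    return 8 * n_blocks / total
```

A single outlier vector in an otherwise uniform neighbourhood lowers the figure sharply, which is the behaviour the indicator is meant to capture.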

C. The Computational Overhead Indicator 

As a final criterion, a standardized complexity measure 
is defined. It concerns the operations count of the algo- 
rithm, selecting a block size of 8*8 pixels (if applicable), 




Fig. 9. A picture from the Roller Coaster sequence used in the M2SE test. 




Fig. 10. A picture from the BBC-DRUM-TEXT sequence. 



function is used as a unit for the operations count. Sub- 
tracters, comparison, and absolute value are assumed to 
yield the same complexity. Multiplications and divisions 
are supposed to cost 3 ops/pel, as their silicon area is 
approximately three times larger than that of an adder 
stage. This criterion can be seen as a first-order approxi- 
mation of the hardware attractiveness, which makes it 
particularly relevant for consumer display conversion ap- 
plications. 
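
The weighting described above can be sketched as a small cost model. The sketch assumes that the unit of the count is an adder-like operation with cost 1, that subtraction, comparison, and absolute value share that cost, and that multiplication and division cost 3. The names `OP_COST` and `ops_per_pel` are hypothetical and chosen for this example only.

```python
# Relative cost per operation type. Assumption: the unit is one
# adder-like operation; multiply and divide cost three units.
OP_COST = {"add": 1, "sub": 1, "cmp": 1, "abs": 1, "mul": 3, "div": 3}


def ops_per_pel(op_counts, pels):
    """Weighted operations count normalized per pixel.

    op_counts maps an operation name from OP_COST to how often it is
    executed; pels is the number of pixels those operations cover.
    """
    weighted = sum(OP_COST[op] * n for op, n in op_counts.items())
    return weighted / pels


# One sum-of-absolute-differences evaluation for a single candidate
# vector on an 8*8 block: 64 subtractions, 64 absolute values,
# 63 additions, and 1 comparison against the best error so far.
print(ops_per_pel({"sub": 64, "abs": 64, "add": 63, "cmp": 1}, 64))  # 3.0
```

In this model the cost of a block matcher grows in proportion to the number of candidate vectors it evaluates per block, which is why the candidate count dominates the comparison.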

IX. Evaluation Results 

In addition to the new 3-D RS design, a fairly large 
number of algorithms are selected to provide a reference 
in the evaluation. The evaluated 3-D RS algorithm includes 
the asynchronous cyclic search of Section V, the 
penalties on updating of Section VI, the block erosion of 
Section VII, and sub-sampling on block and pixel level as 
mentioned in [26] in order to further reduce the operations 
count. Full search (FS) was included in the comparison, as this method 
probably yields the best possible result from all nonhierar- 
chical block matchers. In order to include more 
hardware-attractive alternatives, the three-step and the 
four-step logarithmic search and the one-at-a-time search 
(OTS) methods were added to the reference list. Hierar- 
chical methods are known to yield displacement vectors 
that correspond more closely to the true motion of objects 
in the image, and therefore they should reveal a better 
score on our quality scale. The two implementations (H2 
and H3 for the two-level and the three-level, respectively) 
described in [34] were evaluated in the comparison. Phase 
plane correlation (PPC) was designed [18]-[20] at BBC 
Research for the demanding professional (television stu- 
dio) field rate conversion applications. Therefore, it is 
regarded as representing the state of the art in motion 
estimation for field rate conversion as far as quality is 
concerned. The implementation of the PPC algorithm 
used is the one described in [9], though the block size was 
adapted from 16*16 to 8*8 pixels. 
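
For orientation, a full-search block matcher of the kind used as a reference here can be sketched as follows. The sketch assumes a sum-of-absolute-differences (SAD) match criterion and a tie-break in favor of shorter vectors. Both are choices made for this example, and the function names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and the
    block displaced by (dx, dy) in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, bx, by, size=8, search=2):
    """Test every displacement in a (2*search+1)^2 window.

    Returns ((dx, dy), error) for the block whose top-left corner is
    (bx, by) in cur. Candidates reaching outside ref are skipped.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside_x = 0 <= bx + dx and bx + dx + size <= width
            inside_y = 0 <= by + dy and by + dy + size <= height
            if not (inside_x and inside_y):
                continue
            err = sad(cur, ref, bx, by, dx, dy, size)
            # Lower error wins; ties go to the shorter vector.
            cand = (err, abs(dx) + abs(dy), dx, dy)
            if best is None or cand < best:
                best = cand
    return (best[2], best[3]), best[0]
```

Because every candidate in the window is evaluated, the result is the global minimum of the match error within that window, at the price of the very high operations count that the efficient search methods try to avoid.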

The M2SE scores of the algorithms are shown in Fig. 
11. As can be seen from the figure, FS does not yield the 
best possible score on the M2SE scale, which reflects the 
influence of the modification in the more usual mean 
square prediction error criterion. Two quality groups can 
be distinguished: a high-quality group consisting of the 
PPC, the hierarchical and FS block matchers, and the 3-D 
RS method; and a low-quality group containing the effi- 
cient search algorithms. 

The standard deviation on the M2SE scores is rather 
high, 14 in the high-quality category and more than 50 for 
the algorithms in the second group. This corresponds to 
our expectation, as the test sequences are very different in 
order to include many different categories of motion. On 
average, the 3-D RS block matcher's M2SE score is the 
best of all estimators apart from PPC. However, it should 
be noted that the comparison of the algorithms is not 
entirely justifiable for two reasons. A first complication is 
that the hierarchical block matcher H3 is the only estima- 
tor that calculates vectors for each block of size 2*2 
pixels. The other "good" algorithms have a block size of 
8*8. The second obstacle for a fair comparison is that the 
PPC does not estimate between fields t and t - T, to 
interpolate a result at t using t - T and t + T, as proposed 
in subsection VIII-A. Instead, the implemented 
PPC algorithm estimates the displacements between t - T 
and t + T and uses these vectors to interpolate field t. 
This is believed to give a slight advantage to the PPC 
method. 

The second performance indicator, the smoothness of 
the vector fields, has to be carefully interpreted, as no 
best smoothness figure can be given. It is, however, rea- 
sonable to classify an algorithm as "better" when its 
smoothness figure is higher, provided its M2SE figure is 
similar or even lower. In this sense, the graph shown in 
Fig. 12 clearly indicates a superior performance, in terms 
of vector consistency, of the 3-D RS block matcher over 
all alternative estimators. This is true not only on the 



[Chart: M2SE score for various ME methods.] 

Fig. 11. Performance comparison of a set of motion estimation algorithms. 
To provide further reference, the non-motion compensated field 
average was calculated to yield an M2SE of 8855. 



average; it was found that the smoothness of the 3-D RS 
block matcher was better, for each of the (very different) 
test sequences, than that of any of the other tested 
algorithms. The smoothness score of H3 is omitted, as the 
smoothness criterion was designed for algorithms with 
one vector per block of 8 * 8 or larger. The photographs at 
the back of this issue show that the smoothness resembles 
that of PPC. These photographs, Figs. 14 to 19, show for 
two typical pictures the vectors generated with the four 
best-performing algorithms. To this end, the vectors were 
coded as a color, which could be shown as an overlay on 
the pictures. As no completely objective quality measures 
for motion vector fields in the application of field rate 
conversion are known, the figures can help illustrate the 
criteria introduced in this paper. Table I shows how colors 
and vector values correspond. 

In the third comparison, the operations count of the 
3-D RS block matcher, as it results from this paper, is 
compared with results from our reference algorithms. As 
Fig. 13 reveals, the 3-D RS design compares favorably, 
also in terms of operations count, with the algorithms 
listed. This is believed to be a consequence of taking the 
hardware considerations into account from the beginning 
of the algorithm design. 

X. Conclusion 

A block-recursive motion estimation algorithm was pro- 
posed in this paper. The bidirectional convergence princi- 
ple enabled combination of the apparently conflicting 
demands for smoothness and yet steep edges in the veloc- 
ity field. The method turns out to be very successful, even 
after adaptation for simple hardware. An additional new 
element is the use of what we have called convergence 
accelerators, a special type of temporal predictor, which 
are shifted with respect to the current position in order to 
create a look-ahead function in the convergence direction. 
A new and very efficient updating strategy, asynchronous 
cyclic search, was introduced. Finally, with block erosion, 
block structures could effectively be removed from the 
resulting vector field. 

Using three new test criteria, the suitability of motion 
estimators for use in consumer television with motion 
compensated field rate doubling was tested. Particularly, 
Fig. 12. Comparison of the vector field consistency score of various 
block-matching algorithms. 




[Chart: Operations count for various ME methods.] 




Fig. 15. Motion vectors of the phase plane correlation algorithm, 
shown as a color overlay on the accelerated Renata sequence. 



Fig. 13. Operations count of some estimators. FS and H3 are left out, 
but they have an operations count of 1895 and 1400 ops/pel, 
respectively. 
Fig. 16. Motion vectors of the three-level hierarchical block matcher, 
shown as a color overlay on the accelerated Renata sequence. 




Fig. 14. Motion vectors of the 3-D RS block matcher, shown as a color 
overlay on the accelerated Renata sequence. 



the smoothness indicator for the estimated vector field 
together with the photographs of the visualized vector 
fields are believed to provide relevant information when 
judging the performance of a motion estimator intended 
for motion compensated field rate conversion. As no 
optimal smoothness figure is known, the M2SE criterion is 
believed to provide a useful instrument to verify that the 
smoothness constraint is not exaggerated. Finally, the 
operations count allowed an early indication of the hard- 
ware attractiveness. 

It can be concluded that the newly designed motion 



Fig. 17. Motion vectors of the full-search block matcher, shown as a 
color overlay on the accelerated Renata sequence. 



the tested block-matching algorithms in the application of 
consumer field rate conversion. The presented vector fields 
leave little room for doubting that the 3-D RS estimator 
outperforms all tested algorithms in the comparison, while 
the operations count suggests that implementation is pos- 
Fig. 18. Motion vectors of the 3-D RS block matcher, shown as a color 
overlay on the Car & Gate sequence. 




Fig. 19. Motion vectors of the phase plane correlation algorithm, 
shown as a color overlay on the Car & Gate sequence. 



TABLE I 

Colors Used to Visualize Motion; Each Color is Used to 
Indicate Positive and Negative Values of a Vector 
Component, but Such that a Minimal Risk of 
Confusion Results 

Vector/Color-Overlay Coding Table 

Color      Grey   Yellow   Cyan   Green   Violet   Red   Blue 
Positive   0      1        2      3       4        5     > 6 
Negative          < -6     -5     -4      -3       -2    -1 
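
The color coding of Table I can be sketched as a small lookup. The sketch assumes integer vector components, reads the table so that Grey has no negative entry, and treats components of magnitude 6 or more as falling in the outermost class (the table itself prints "> 6" and "< -6", so the handling of exactly 6 and -6 is an assumption). The names `COLORS` and `overlay_color` are hypothetical.

```python
COLORS = ["Grey", "Yellow", "Cyan", "Green", "Violet", "Red", "Blue"]


def overlay_color(component):
    """Map one integer vector component to its overlay color.

    Positive side:  0 -> Grey, 1 -> Yellow, ..., 5 -> Red, 6+ -> Blue.
    Negative side: -1 -> Blue, -2 -> Red, ..., -5 -> Cyan, -6 or
    below -> Yellow.
    """
    if component >= 0:
        return COLORS[min(component, 6)]
    # Clamp to -6, then shift so that -1 lands on the last color.
    return COLORS[max(component, -6) + 7]


for v in (-8, -4, -1, 0, 3, 9):
    print(v, overlay_color(v))
```

With this reading, the two values that share a color always lie seven steps apart (for example 3 and -4), which is one way to interpret the "minimal risk of confusion" mentioned in the table title.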